Abstract — In this paper we study, from an information-theoretic
viewpoint, the impact of transmitter and/or receiver cognition on
channel capacity. The cognition can be described by state
information that depends on the channel noise and/or the input.
Specifically, as a new idea, we model the receiver cognition as
state information that depends on the noise, and we derive a
capacity theorem based on the Gaussian version of the Cover-Chiang
capacity theorem for the channel with two-sided state information.
As intuitively expected, the receiver cognition increases the
channel capacity, and our theorem quantifies this increase. Our
capacity theorem also includes the well-known Costa theorem as a
special case.

Index Terms — transmitter-receiver cognition, Gaussian channel
capacity, correlated side information.



I. Introduction 

The information-theoretic study of the impact of transmitter
and/or receiver cognition on channel capacity is a new idea and an
important research issue. For example, a single channel may have
different capacities from the viewpoints of two receivers that have
different cognition of, and information about, the channel. The
cognition at the transmitter or receiver can be described by a
standard concept of information theory, namely side information.

Channels with side information have been studied extensively
since they were introduced by Shannon [1] and subsequently studied
by Kuznetsov-Tsybakov [2]. The capacity of the channel with side
information (CSI) known non-causally only at the transmitter and
only at the receiver was determined by Gel'fand-Pinsker (GP) [3]
and Heegard-El Gamal [4], respectively. Applying the GP theorem to
the Gaussian channel, Costa [5] obtained an interesting result: the
channel capacity in the presence of interference known at the
transmitter is the same as in the case without interference.
Extending the above results, Cover-Chiang [6] established a general
capacity theorem for the channel with two-sided state information.
Other important contributions in the literature include [7]-[9].
The results obtained for point-to-point channels with side
information have been extended, at least in special cases, to
multiuser channels [10]-[14].

As mentioned above, our motivation is the fact that the
cognition of the transmitter and receiver can affect the channel
capacity. In order to quantify this effect, we model the cognition
as state information that depends on the channel noise and/or the
input. We then derive a capacity theorem and prove that, as
expected, the receiver cognition increases the channel capacity;
the theorem quantifies this increase. Our capacity theorem, while
revealing the importance of the Costa theorem, is more general and
includes the Costa theorem as a special case.

Fig. 1. Channel with side information available non-causally at the
transmitter and at the receiver.

In the remainder of this section we briefly review the
Cover-Chiang, Gel'fand-Pinsker and Costa theorems.

Cover-Chiang Theorem: Fig. 1 shows a channel with side
information known at the transmitter and at the receiver. $X^n$ and
$Y^n$ are the transmitted and received sequences, respectively.
The sequences $S_1^n$ and $S_2^n$ are the side information known
non-causally at the transmitter and at the receiver, respectively.
The transition probability of the channel, $p(y \mid x, s_1, s_2)$,
depends on the input $X$ and on the side information $S_1$ and
$S_2$. If the channel is memoryless and the pairs
$(S_{1,i}, S_{2,i})$ are independent and identically distributed
(i.i.d.) under $p(s_1, s_2)$, then the capacity of the channel is
[6]:



$$C = \max_{p(u,x \mid s_1)} \left[ I(U; S_2, Y) - I(U; S_1) \right] \tag{1}$$

where the maximum is over all distributions of the form

$$p(y, x, u, s_1, s_2) = p(y \mid x, s_1, s_2)\, p(u, x \mid s_1)\, p(s_1, s_2) \tag{2}$$

and $U$ is an auxiliary random variable that conveys the
information of the known $S_1^n$ into $X^n$.

It is important to note that the Markov chain

$$S_2 \to S_1 \to UX \tag{3}$$

is satisfied by all of the above distributions.

Gel'fand-Pinsker Theorem: The case $S_2 = \emptyset$ (no side
information at the receiver) leads to the Gel'fand-Pinsker theorem
[3]: the memoryless channel with transition probability
$p(y \mid x, s_1)$ and side information sequence $S_1^n$ (which is
i.i.d. $\sim p(s_1)$) known non-causally at the transmitter (Fig. 2)
has the capacity

$$C = \max_{p(u,x \mid s_1)} \left[ I(U; Y) - I(U; S_1) \right] \tag{4}$$

for all distributions

$$p(y, x, u, s_1) = p(y \mid x, s_1)\, p(u, x \mid s_1)\, p(s_1) \tag{5}$$

where $U$ is an auxiliary random variable.

Fig. 2. Channel with side information known at the transmitter.

Fig. 3. Gaussian channel with additive interference known at the transmitter.

Costa's "Writing on Dirty Paper": Costa [5] examined the
Gaussian version of the channel with side information known
at the transmitter (Fig. 3).

The side information is treated as additive interference at the
receiver. Costa derived the capacity by using the Gel'fand-Pinsker
theorem extended to random variables with continuous alphabets. A
sketch of the proof is as follows. In the Costa channel, $S_1^n$ is
a sequence of i.i.d. Gaussian random variables with power $Q_1$.
The transmitted sequence $X^n$ is subject to the power constraint
$E\{X^2\} \le P$. The output is $Y^n = X^n + S_1^n + Z^n$, where
$Z^n$ is a sequence of white Gaussian noise with zero mean and
power $N$ ($Z \sim \mathcal{N}(0, N)$), independent of both $X$ and
$S_1$. Costa established the capacity by obtaining a lower bound
and an upper bound and proving that the two bounds coincide.
Although the Costa channel imposes no condition on the correlation
between the channel input $X$ and the known interference $S_1$, the
achievable rate $\frac{1}{2}\log\left(1 + \frac{P}{N}\right)$ is
obtained by taking $S_1$ and $X$ independent and choosing the
auxiliary random variable $U$ in (5) as $U = \alpha S_1 + X$. On the
other hand, it can be shown that:



$$C \le \max_{p(x \mid s_1)} I(X; Y \mid S_1) \le \frac{1}{2}\log\left(1 + \frac{P}{N}\right) \tag{6}$$

so $\frac{1}{2}\log\left(1 + \frac{P}{N}\right)$ is also an upper
bound for the capacity of the channel and is therefore the capacity
of the channel. What is surprising is that the capacity is
independent of $S_1$ and equals the capacity of the channel when
there is no interference $S_1$.
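
As a numerical sanity check of the achievability argument above, the following minimal Python sketch evaluates $I(U;Y) - I(U;S_1)$ for $U = \alpha S_1 + X$ from Gaussian covariance determinants. The values of `P`, `Q1` and `N` are arbitrary illustrative assumptions, and the choice $\alpha = P/(P+N)$ is the maximizer reported in Costa's paper [5]; it is not derived in this section.

```python
import math

def costa_rate(alpha, P, Q1, N):
    """I(U;Y) - I(U;S1) in nats for U = alpha*S1 + X, Y = X + S1 + Z.

    Assumes X ~ N(0, P), S1 ~ N(0, Q1), Z ~ N(0, N), mutually independent.
    """
    var_u = P + alpha**2 * Q1          # Var(U)
    var_y = P + Q1 + N                 # Var(Y)
    cov_uy = P + alpha * Q1            # Cov(U, Y)
    cov_us = alpha * Q1                # Cov(U, S1)
    # For jointly Gaussian (A, B): I(A;B) = 0.5*log(Var(A)Var(B) / det cov(A,B))
    i_uy = 0.5 * math.log(var_u * var_y / (var_u * var_y - cov_uy**2))
    i_us = 0.5 * math.log(var_u * Q1 / (var_u * Q1 - cov_us**2))
    return i_uy - i_us

# Illustrative (hypothetical) parameter values.
P, Q1, N = 4.0, 10.0, 1.0
alpha_star = P / (P + N)               # Costa's choice of alpha
rate = costa_rate(alpha_star, P, Q1, N)
no_interference = 0.5 * math.log(1 + P / N)
print(rate, no_interference)
```

With these values the two printed numbers agree, and other values of `alpha` give a strictly smaller rate, which is consistent with the statement that the known interference does not reduce the capacity.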

II. A Capacity Theorem for Analyzing the Impact 
of Transmitter-Receiver Cognition on Channel 
Capacity 

In this section we define and investigate a Gaussian channel
in the presence of two-sided side information known non-causally
at the transmitter and at the receiver. The side information at
the transmitter and at the receiver is treated as additive
interference at the receiver (Fig. 4). Compared with the Costa
channel, our channel has two major modifications: 1) In the Costa
channel there is no condition on the correlation between the
channel input $X$ and the side information $S_1$. Thus
$\frac{1}{2}\log\left(1 + \frac{P}{N}\right)$ is the capacity of a
channel in which the side information $S_1$ can be freely
correlated with the channel input $X$, and this capacity cannot be
used for a channel with a specific correlation between $X$ and
$S_1$. In our channel, the correlation coefficient $\rho_{XS_1}$
between $X$ and $S_1$ is specified. 2) We suppose that Gaussian
side information $S_2$, known at the receiver, exists and is
correlated with the channel noise $Z$.

Fig. 4. Gaussian channel with correlated side information known at the
transmitter and at the receiver.

It is important to note that assuming that the input random
variable $X$ and $S_1$ are correlated with a specific correlation
coefficient does not impose any restriction on the distribution of
$X$ itself, which can still be chosen freely.

Definition of the Channel 

Consider the Gaussian channel depicted in Fig. 4. Our channel
is defined by properties D.1-D.4 below:

D.1: $(S_1^n, S_2^n)$ are i.i.d. sequences with zero mean and
jointly Gaussian distributions.

D.2: The random variables $(X, S_1, S_2)$ have the covariance
matrix $K$:

$$K = \begin{bmatrix}
\sigma_X^2 & \sigma_X \sigma_{S_1} \rho_{XS_1} & \sigma_X \sigma_{S_2} \rho_{XS_2} \\
\sigma_X \sigma_{S_1} \rho_{XS_1} & \sigma_{S_1}^2 & \sigma_{S_1} \sigma_{S_2} \rho_{S_1 S_2} \\
\sigma_X \sigma_{S_2} \rho_{XS_2} & \sigma_{S_1} \sigma_{S_2} \rho_{S_1 S_2} & \sigma_{S_2}^2
\end{bmatrix} \tag{7}$$

We suppose that $S_2$ is independent of $X$ and $S_1$, so we have
$\rho_{XS_2} = \rho_{S_1 S_2} = 0$. Moreover, $X^n$ is assumed to
satisfy the constraint $E\{X^2\} = \sigma_X^2 \le P$. All values in
$K$ except $\sigma_X$ are fixed and must be considered part of the
definition of the channel.

D.3: The output sequence is $Y^n = X^n + S_1^n + S_2^n + Z^n$,
where $Z^n$ is a sequence of white Gaussian noise with zero mean
and power $\sigma_Z^2 = N$ ($Z \sim \mathcal{N}(0, N)$), independent
of $(X, S_1)$ and correlated with $S_2$ with correlation coefficient
$\rho_{S_2 Z}$. For simplicity, we define:

$$L_2 = E\{S_2 Z\} = \sigma_{S_2} \sigma_Z \rho_{S_2 Z}. \tag{8}$$

D.4: $(X, U, S_1, S_2)$ form the Markov chain
$S_2 \to S_1 \to UX$. (As mentioned earlier, this Markov chain must
be satisfied by all distributions $p(y, x, u, s_1, s_2)$ in the
Cover-Chiang capacity theorem and is physically acceptable.)

It is readily seen that all distributions $p(y, x, u, s_1, s_2)$
specified by D.1-D.4 are of the form (2), and hence we can apply
the version of the Cover-Chiang theorem extended to random
variables with continuous alphabets to find the capacity of this
channel.
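
To make the second-order structure of D.1-D.3 concrete, the sketch below builds the joint covariance matrix of $(X, S_1, S_2, Z)$ and derives the output power from it. The numerical values and the function name `joint_covariance` are hypothetical, and the sketch assumes that $X$ uses its full power $P$.

```python
import numpy as np

def joint_covariance(P, Q1, Q2, N, rho_xs1, rho_s2z):
    """Covariance of (X, S1, S2, Z) under D.1-D.3, assuming Var(X) = P."""
    A1 = rho_xs1 * np.sqrt(P * Q1)   # E{X S1}
    L2 = rho_s2z * np.sqrt(Q2 * N)   # E{S2 Z}, as in (8)
    # S2 is independent of (X, S1); Z is independent of (X, S1).
    return np.array([
        [P,   A1,  0.0, 0.0],
        [A1,  Q1,  0.0, 0.0],
        [0.0, 0.0, Q2,  L2],
        [0.0, 0.0, L2,  N],
    ])

# Illustrative (hypothetical) channel parameters.
cov = joint_covariance(P=4.0, Q1=10.0, Q2=2.0, N=1.0, rho_xs1=0.3, rho_s2z=0.6)

# Y = X + S1 + S2 + Z, so Var(Y) = 1^T cov 1.
ones = np.ones(4)
var_y = float(ones @ cov @ ones)

# A valid channel description needs a positive semidefinite covariance.
is_valid = bool(np.all(np.linalg.eigvalsh(cov) >= -1e-12))
print(var_y, is_valid)
```

The block-diagonal shape of the matrix reflects the two independence assumptions of the definition: the pair $(X, S_1)$ on one side and the pair $(S_2, Z)$ on the other.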
Comparing our channel (defined by D.1-D.4) with the Costa
channel, a question may arise: if we ignore $S_2$, what is the
relationship between the capacities of these channels? To answer
this question, consider the subset of distributions (channels)
$p(y, x, u, s_1)$ (ignoring $S_2$) that satisfy D.1-D.4 and differ
only in $\rho_{XS_1}$. Since the Costa channel imposes no
restriction on $\rho_{XS_1}$, these channels differ from the
corresponding Costa channel only in that $\rho_{XS_1}$ is
restricted. Clearly, finding the capacity of the Costa channel
amounts to finding the maximum capacity in this subset. So if $C_D$
is the capacity of the channel defined by D.1-D.4 and $C$ is the
capacity of the Costa channel, we can write:

$$C = \max_{\rho_{XS_1},\ \rho_{S_2 Z} = 0} C_D. \tag{9}$$



We will show that the case in which $(X, S_1, S_2)$ are jointly
Gaussian and the auxiliary random variable $U$ is a linear
combination of $X$ and $S_1$ is optimal and maximizes the
transmission rate. We therefore consider an important subset of
the distributions $p(y, x, u, s_1, s_2)$ defined in D.1-D.4, namely
the set of all $p^*(y, x, u, s_1, s_2)$ that have properties D.5
and D.6 below in addition to D.1-D.4 (although the channel itself
is defined only by D.1-D.4):

D.5: The random variables $(X, S_1, S_2)$ are jointly Gaussian.
$X$ has zero mean and the maximum power $P$ (so
$X \sim \mathcal{N}(0, P)$). Denoting the covariance matrix in this
special case by $K^*$ and defining, for simplicity,
$A_1 = E\{X S_1\}$, $Q_1 = \sigma_{S_1}^2$ and
$Q_2 = \sigma_{S_2}^2$, we rewrite:

$$K^* = \begin{bmatrix} P & A_1 & 0 \\ A_1 & Q_1 & 0 \\ 0 & 0 & Q_2 \end{bmatrix} \tag{10}$$

D.6: Following Costa, we take $U$ to be a linear combination
of $X$ and $S_1$, namely $U = \alpha S_1 + X$.

To shorten the expressions, we define the following two
symbols:

$$d_{Q_2} \triangleq P Q_1 - A_1^2 = \sigma_X^2 \sigma_{S_1}^2 \left(1 - \rho_{XS_1}^2\right) \tag{11}$$

$$d_{PQ_1} \triangleq Q_2 N - L_2^2 = \sigma_{S_2}^2 \sigma_Z^2 \left(1 - \rho_{S_2 Z}^2\right) \tag{12}$$

Capacity of the Channel

Theorem 1: The Gaussian channel defined by properties
D.1-D.4 has the capacity

$$C_D = \frac{1}{2} \log\left(1 + \frac{P \left(1 - \rho_{XS_1}^2\right)}{N \left(1 - \rho_{S_2 Z}^2\right)}\right). \tag{13}$$

Corollary 1: As mentioned earlier, by (9) we can obtain the
Costa capacity by setting $\rho_{S_2 Z} = 0$ and maximizing $C_D$,
which is achieved with $\rho_{XS_1} = 0$.

Corollary 2: If the side information $S_2$ is independent of
the channel noise $Z$ (so that $\rho_{S_2 Z} = 0$), the capacity of
the channel equals the capacity when there is no interference
$S_2$. In other words, in this case the receiver can subtract the
known $S_2^n$ from the received $Y^n$ without losing any useful
information. But when the state information $S_2$ is correlated
with the additive noise $Z$, $S_2$ contains useful information that
increases the capacity, and hence simply subtracting $S_2$ is a
wrong decoding strategy.

Corollary 3: While, as intuitively expected, correlation
between $S_2$ and $Z$ increases the capacity, correlation between
$X$ and $S_1$ decreases it.

Proof of Theorem 1: To prove the theorem, we first show that
$C_D$ in (13) is a lower bound for the capacity of the channel; we
then show that $C_D$ is also an upper bound, so $C_D$ is the
capacity of the channel.

Achievability part of the proof: We use the extended version
of the Cover-Chiang capacity theorem (1) to obtain a lower bound
for the capacity of the channel. For all distributions
$p(y, x, u, s_1, s_2)$ (with properties D.1-D.4) and the subset
$p^*(y, x, u, s_1, s_2)$ (defined by properties D.1-D.6), we can
write:

\begin{align}
C &= \max_{p(u,x \mid s_1)} \left[ I(U; Y, S_2) - I(U; S_1) \right] \tag{14} \\
&\ge \max_{p^*(u \mid x, s_1)\, p^*(x \mid s_1)} \left[ I(U; Y, S_2) - I(U; S_1) \right] \tag{15} \\
&= \max_{\alpha} \left[ I(U; Y, S_2) - I(U; S_1) \right] \tag{16} \\
&= \max_{\alpha} R_D(\alpha) = R_D(\alpha^*). \tag{17}
\end{align}

So $R_D(\alpha^*)$ is a lower bound for the capacity of the
channel. To compute $R_D(\alpha)$ we write (details of the
computations are omitted for brevity):

\begin{align}
I(U; Y, S_2) &= H(U) + H(Y, S_2) - H(U, Y, S_2), \tag{18} \\
I(U; S_1) &= H(U) + H(S_1) - H(U, S_1), \tag{19}
\end{align}

where

\begin{align}
H(Y, S_2) &= \frac{1}{2} \log\left( (2\pi e)^2 \det\left(\operatorname{cov}(Y, S_2)\right) \right)
= \frac{1}{2} \log\left( (2\pi e)^2 \left[ Q_2 (P + Q_1 + 2A_1) + d_{PQ_1} \right] \right) \tag{20} \\
H(U, Y, S_2) &= \frac{1}{2} \log\left( (2\pi e)^3 \left[ d_{PQ_1} \left(\alpha^2 Q_1 + 2\alpha A_1 + P\right) + (\alpha - 1)^2 Q_2\, d_{Q_2} \right] \right) \tag{21} \\
H(S_1) &= \frac{1}{2} \log\left( (2\pi e)\, Q_1 \right) \tag{22} \\
H(U, S_1) &= \frac{1}{2} \log\left( (2\pi e)^2 d_{Q_2} \right) \tag{23}
\end{align}

Substituting (20)-(23) into (18) and (19), we obtain:

$$R_D(\alpha) = \frac{1}{2} \log \frac{d_{Q_2} \left( Q_2 (P + Q_1 + 2A_1) + d_{PQ_1} \right)}{Q_1 \left( (\alpha - 1)^2 Q_2\, d_{Q_2} + d_{PQ_1} \left(\alpha^2 Q_1 + 2\alpha A_1 + P\right) \right)} \tag{24}$$

and after maximizing it over $\alpha$, we conclude:

$$\alpha^* = \frac{Q_2\, d_{Q_2} - A_1\, d_{PQ_1}}{Q_2\, d_{Q_2} + Q_1\, d_{PQ_1}}. \tag{25}$$

Now, if we compute $R_D(\alpha^*)$ by substituting (25) into (24)
and then rewrite the resulting expression in terms of $\sigma_X$,
$\sigma_{S_1}$, $\sigma_{S_2}$, $\rho_{XS_1}$, $\rho_{S_2 Z}$ by (8)
and (10)-(12), we finally conclude:

$$R_D(\alpha^*) = \frac{1}{2} \log\left(1 + \frac{P \left(1 - \rho_{XS_1}^2\right)}{N \left(1 - \rho_{S_2 Z}^2\right)}\right). \tag{26}$$
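
The maximization over $\alpha$ in the achievability part can be checked numerically. The sketch below is a minimal illustration under assumed parameter values: it evaluates the rate expression (24), the maximizer (25) and the closed form (26), using descriptive variable names `d_input` and `d_noise` (hypothetical names) for the quantities defined in (11) and (12).

```python
import math

def rate(alpha, P, Q1, Q2, N, A1, L2):
    """R_D(alpha) from (24), in nats."""
    d_input = P * Q1 - A1**2     # right-hand side of (11)
    d_noise = Q2 * N - L2**2     # right-hand side of (12)
    num = d_input * (Q2 * (P + Q1 + 2 * A1) + d_noise)
    den = Q1 * ((alpha - 1)**2 * Q2 * d_input
                + d_noise * (alpha**2 * Q1 + 2 * alpha * A1 + P))
    return 0.5 * math.log(num / den)

def best_alpha(P, Q1, Q2, N, A1, L2):
    """alpha* from (25)."""
    d_input = P * Q1 - A1**2
    d_noise = Q2 * N - L2**2
    return (Q2 * d_input - A1 * d_noise) / (Q2 * d_input + Q1 * d_noise)

def closed_form(P, N, rho_xs1, rho_s2z):
    """Right-hand side of (26), in nats."""
    return 0.5 * math.log(1 + P * (1 - rho_xs1**2) / (N * (1 - rho_s2z**2)))

# Illustrative (hypothetical) parameters.
P, Q1, Q2, N = 4.0, 10.0, 2.0, 1.0
rho_xs1, rho_s2z = 0.3, 0.6
A1 = rho_xs1 * math.sqrt(P * Q1)   # E{X S1}
L2 = rho_s2z * math.sqrt(Q2 * N)   # E{S2 Z}

a_star = best_alpha(P, Q1, Q2, N, A1, L2)
r_star = rate(a_star, P, Q1, Q2, N, A1, L2)
r_closed = closed_form(P, N, rho_xs1, rho_s2z)
print(a_star, r_star, r_closed)
```

For these values `r_star` and `r_closed` coincide, and perturbing `a_star` in either direction lowers the rate. With `rho_xs1 = 0` and `rho_s2z = 0` the maximizer reduces to `P / (P + N)`.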
Converse part of the proof: For all distributions
$p(y, x, u, s_1, s_2)$ defined by properties D.1-D.4, we have:

$$I(U; Y, S_2) - I(U; S_1) = -H(U \mid Y, S_2) + H(U \mid S_1) \le I(X; Y \mid S_1, S_2) \tag{27}$$

where (27) follows from the Markov chains $S_2 \to S_1 \to UX$
and $U \to X S_1 S_2 \to Y$, which hold for all distributions
defined by properties D.1-D.4. Specifically, the first chain gives
$H(U \mid S_1) = H(U \mid S_1, S_2)$, conditioning reduces entropy
so that $H(U \mid Y, S_2) \ge H(U \mid Y, S_1, S_2)$, and the second
chain gives $I(U; Y \mid S_1, S_2) \le I(X; Y \mid S_1, S_2)$. Now
from (1) and (27) we can write:

\begin{align}
C &= \max_{p(u,x \mid s_1)} \left[ I(U; Y, S_2) - I(U; S_1) \right] \tag{28} \\
&\le \max_{p(x \mid s_1)} I(X; Y \mid S_1, S_2) = I^*(X; Y \mid S_1, S_2), \tag{29}
\end{align}

hence $I^*(X; Y \mid S_1, S_2)$ is an upper bound for the capacity
of the channel. To compute it, we write:

$$I(X; Y \mid S_1, S_2) = H\left((X + Z), S_1, S_2\right) - H(S_1, S_2) - H(Z \mid S_2). \tag{30}$$

So when (29) reaches its maximum, $(X, S_1, S_2)$ are jointly
Gaussian and $X$ has its maximum power $P$. This means that
$I^*(X; Y \mid S_1, S_2)$ is the value of (30) computed for the
distributions $p^*(y, x, s_1, s_2)$ defined by properties D.1-D.6.
After computing, we have:



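\begin{align}
H\left((X + Z), S_1, S_2\right) &= \frac{1}{2} \log\left( (2\pi e)^3 \left( Q_2\, d_{Q_2} + Q_1\, d_{PQ_1} \right) \right) \tag{31} \\
H(S_1, S_2) &= \frac{1}{2} \log\left( (2\pi e)^2 Q_1 Q_2 \right) \tag{32} \\
H(Z, S_2) &= \frac{1}{2} \log\left( (2\pi e)^2 d_{PQ_1} \right) \tag{33}
\end{align}

so we obtain from (30), using
$H(Z \mid S_2) = H(Z, S_2) - H(S_2)$ with
$H(S_2) = \frac{1}{2} \log\left( (2\pi e)\, Q_2 \right)$:

\begin{align}
I^*(X; Y \mid S_1, S_2) &= \frac{1}{2} \log \frac{Q_2\, d_{Q_2} + Q_1\, d_{PQ_1}}{Q_1\, d_{PQ_1}} \tag{34} \\
&= \frac{1}{2} \log\left(1 + \frac{P \left(1 - \rho_{XS_1}^2\right)}{N \left(1 - \rho_{S_2 Z}^2\right)}\right) \tag{35}
\end{align}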



where (35) follows by rewriting (34) in terms of $\sigma_X$,
$\sigma_{S_1}$, $\sigma_{S_2}$, $\rho_{XS_1}$, $\rho_{S_2 Z}$ by (8)
and (10)-(12).

From (26) and (35), we conclude that $C_D$ in (13) is the
capacity of the channel. □
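
The upper bound can also be checked without the entropy expansion (30). For jointly Gaussian variables, $I(X; Y \mid S_1, S_2) = \frac{1}{2}\log\frac{\operatorname{Var}(Y \mid S_1, S_2)}{\operatorname{Var}(Y \mid X, S_1, S_2)}$, and each conditional variance is a Schur complement of the joint covariance. The sketch below uses this route; the parameter values and helper names are illustrative assumptions, and $X$ is assumed to use its full power $P$.

```python
import numpy as np

def conditional_variance(cov, target, given):
    """Var(target | given) for jointly Gaussian variables (Schur complement)."""
    g = list(given)
    c_tg = cov[target, g]            # cross-covariances with the conditioning set
    c_gg = cov[np.ix_(g, g)]         # covariance of the conditioning set
    return float(cov[target, target] - c_tg @ np.linalg.solve(c_gg, c_tg))

def converse_bound(P, Q1, Q2, N, rho_xs1, rho_s2z):
    """I(X;Y|S1,S2) in nats for the jointly Gaussian case with Var(X) = P."""
    A1 = rho_xs1 * np.sqrt(P * Q1)   # E{X S1}
    L2 = rho_s2z * np.sqrt(Q2 * N)   # E{S2 Z}
    base = np.array([                # covariance of (X, S1, S2, Z)
        [P,   A1,  0.0, 0.0],
        [A1,  Q1,  0.0, 0.0],
        [0.0, 0.0, Q2,  L2],
        [0.0, 0.0, L2,  N],
    ])
    # Linear map to (X, Y, S1, S2) with Y = X + S1 + S2 + Z.
    T = np.array([
        [1.0, 0.0, 0.0, 0.0],
        [1.0, 1.0, 1.0, 1.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
    ])
    cov = T @ base @ T.T
    var_y_given_s = conditional_variance(cov, 1, [2, 3])
    var_y_given_xs = conditional_variance(cov, 1, [0, 2, 3])
    return 0.5 * np.log(var_y_given_s / var_y_given_xs)

# Illustrative (hypothetical) parameters.
P, Q1, Q2, N, rho_xs1, rho_s2z = 4.0, 10.0, 2.0, 1.0, 0.3, 0.6
bound = converse_bound(P, Q1, Q2, N, rho_xs1, rho_s2z)
closed = 0.5 * np.log(1 + P * (1 - rho_xs1**2) / (N * (1 - rho_s2z**2)))
print(bound, closed)
```

The two printed values agree for these parameters, which matches the closed form in (35) and therefore the lower bound in (26).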

III. Numerical Results 

Fig. 5 illustrates the impact of the correlation between $S_2$
and the channel noise $Z$ on the channel capacity. The figure is
plotted for independent $X$ and $S_1$ (so $\rho_{XS_1} = 0$). It is
seen that the more $S_2$ depends on the noise $Z$, the greater the
capacity of the channel. In the case of full dependence,
$\rho_{S_2 Z} = \pm 1$, the capacity of the channel is infinite.
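
The trend described above can be reproduced from the capacity formula (13). The following sketch tabulates the capacity for a few values of $\rho_{S_2 Z}$ with $\rho_{XS_1} = 0$. The choice $P = N = 1$, the grid of correlation values and the use of base-2 logarithms (bits per channel use) are assumptions made only for illustration; they are not the settings used for Fig. 5.

```python
import math

def capacity(P, N, rho_xs1, rho_s2z):
    """Capacity from (13) in bits per channel use, for |rho_xs1| < 1."""
    if abs(rho_s2z) >= 1.0:
        # Full dependence between S2 and Z: the formula diverges.
        return math.inf
    snr = P * (1 - rho_xs1**2) / (N * (1 - rho_s2z**2))
    return 0.5 * math.log2(1 + snr)

# Hypothetical settings: unit power, unit noise, independent X and S1.
P, N, rho_xs1 = 1.0, 1.0, 0.0
rhos = [0.0, 0.3, 0.6, 0.9, 0.99]
values = [capacity(P, N, rho_xs1, r) for r in rhos]

for r, c in zip(rhos, values):
    print(f"rho_S2Z = {r:4.2f}  ->  C_D = {c:.4f} bits")
```

The first entry equals the capacity of the channel without side information at the receiver, and the values grow without bound as the correlation magnitude approaches one.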

IV. Conclusion 

We investigated the Gaussian channel in the presence of 
two-sided state information with dependency on the input 
and the channel noise. Having established a capacity theorem 
for the channel, we illustrated the impact of the receiver 
cognition (the correlation between the channel noise and state 
information known at the receiver) and the correlation between 
the input and the side information known at the transmitter, 
on the capacity of the channel. 




Fig. 5. The impact of the correlation between the state information $S_2$
and the noise $Z$ on the capacity of the channel. The figure is plotted
with $\rho_{XS_1} = 0$.